A Comprehensive Study of Learning Approaches for Author Gender Identification
Authors
Abstract
In recent years, author gender identification has become an important yet challenging task in the fields of information retrieval and computational linguistics. In this paper, different learning approaches are presented to address this problem for Turkish articles. First, several classification algorithms are applied to a list of representations based on different paradigms: fixed-length vector representations such as Stylometric Features (SF) and Bag-of-Words (BoW), and distributed word/document embeddings such as Word2vec, fastText and Doc2vec. Secondly, deep learning architectures, namely Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), special kinds of RNN such as Long-Short Term Memory (LSTM) and Gated Recurrent Unit (GRU), C-RNN, Bidirectional LSTM (bi-LSTM), Bidirectional GRU (bi-GRU), Hierarchical Attention Networks and Multi-head Attention (MHA), are designated and their comparable performances are evaluated. We conducted a variety of experiments and achieved outstanding empirical results. To conclude, ML algorithms with BoW have promising results, and fastText is also probably a suitable choice among the embedding models. This comprehensive study contributes to the literature by utilizing different learning approaches and representations. It is also the first attempt to identify author gender by applying SF, fastText and DNN architectures to the Turkish language.
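The abstract names the representations (SF, BoW) but does not give the feature definitions or the specific classifiers, so the following is only a minimal sketch of the first paradigm under stated assumptions. The three stylometric features, the multinomial Naive Bayes classifier, the helper names (`tokenize`, `bow_vector`, `stylometric_features`, `NaiveBayes`), the placeholder labels "a"/"b", and the toy Turkish word lists are all illustrative choices, not taken from the paper or its dataset.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep runs of Unicode letters (digits and punctuation dropped).
    return re.findall(r"[^\W\d_]+", text.lower())


def bow_vector(text, vocab):
    # Fixed-length Bag-of-Words vector: one count per vocabulary word.
    counts = Counter(tokenize(text))
    return [counts[w] for w in vocab]


def stylometric_features(text):
    # Three simple style measures (hypothetical picks; the paper's SF set is not given).
    tokens = tokenize(text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n = max(len(tokens), 1)
    return {
        "avg_word_len": sum(len(t) for t in tokens) / n,
        "avg_sent_len": len(tokens) / max(len(sentences), 1),
        "type_token_ratio": len(set(tokens)) / n,
    }


class NaiveBayes:
    """Multinomial Naive Bayes over BoW counts with Laplace smoothing."""

    def fit(self, docs, labels):
        self.vocab = sorted({w for d in docs for w in tokenize(d)})
        self.classes = sorted(set(labels))
        self.log_prior, self.log_like = {}, {}
        for c in self.classes:
            class_docs = [d for d, y in zip(docs, labels) if y == c]
            self.log_prior[c] = math.log(len(class_docs) / len(docs))
            totals = [0] * len(self.vocab)
            for d in class_docs:
                totals = [a + b for a, b in zip(totals, bow_vector(d, self.vocab))]
            denom = sum(totals) + len(self.vocab)  # add-one smoothing
            self.log_like[c] = [math.log((t + 1) / denom) for t in totals]
        return self

    def predict(self, text):
        # Words outside the training vocabulary are ignored.
        x = bow_vector(text, self.vocab)
        scores = {
            c: self.log_prior[c] + sum(n * ll for n, ll in zip(x, self.log_like[c]))
            for c in self.classes
        }
        return max(scores, key=scores.get)


# Toy data, invented for illustration only.
train_docs = [
    "ekonomi piyasa faiz borsa",
    "piyasa faiz dolar ekonomi",
    "sanat müzik sergi tiyatro",
    "tiyatro sergi müzik film",
]
train_labels = ["a", "a", "b", "b"]
model = NaiveBayes().fit(train_docs, train_labels)
print(model.predict("borsa dolar faiz"))
print(stylometric_features("Kısa cümle. Bir tane daha!"))
```

Everything here is written by hand with the standard library to keep the sketch self-contained; in practice a library such as scikit-learn would normally supply the vectorizer and the classifiers, and the embedding-based and deep learning approaches listed in the abstract need dedicated tooling that this sketch does not cover.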
Similar resources
Author gender identification from text using Bayesian Random Forest
Nowadays, the heavy use of virtual environments and users' connections via social networks such as Facebook, Instagram, and Twitter show, more than ever, the need to find shared subjects in this environment. Several applications benefit from reliable methods for inferring the age and gender of social media users. Such applications exist across a wide range of fields,...
Applying transitivity theory to gender analysis of EFL textbook: a comparative study
EFL/ESL textbooks have been regarded as essential language teaching materials with which learners spend about 70 to 90 percent of their class time. The important role they play and their vast use make them influential not only in learning the language but also in shaping values and attitudes. Put another way, textbooks socialize learners through their contents (i.e. texts, illustrations...
A study on thermodynamic models for simulation of 1,3-butadiene purification columns
Attempts have been made to study the thermodynamic behavior of 1,3-butadiene purification columns with the aim of retrofitting those columns to more energy-efficient separation schemes. 1,3-Butadiene is purified in two columns in series, being separated from methyl acetylene in the first column and from 1,2-butadiene in the second. Comparisons have been made among different therm...
A study of translation of English literary terms into Persian
The present study examines the translation of specialized literary terms in order to explore their translatability and the strategies used by three Persian-speaking translators: Siamak Babaei (1386), Sima Dad (1378), and Saeed Sabzian (1384). A further aim is to investigate the word-formation methods used in providing Persian equivalents for literary terms. In line with these aims, the theoretical framework of this study, the translation strategies...
A Study on Author Identification through Stylometry
Electronic communication is one of the most popular means of communication in this era, and e-mail is its most popular form. The Internet serves as the backbone for these communications. In digital forensics, the question arises of whether a document's author, and that author's identity and demographic background, can be linked to other documents. So identification of the au...
Journal
Journal title: Information Technology and Control
Year: 2022
ISSN: 1392-124X, 2335-884X
DOI: https://doi.org/10.5755/j01.itc.51.3.29907